Overview


Dataset statistics

Number of variables: 17
Number of observations: 459347
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 314.9 MiB
Average record size in memory: 718.7 B
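
As a quick sanity check, the average record size should be roughly the total in-memory size divided by the number of observations. The sketch below redoes that arithmetic from the rounded figures above, assuming 1 MiB = 1,048,576 bytes. A small gap to the reported 718.7 B is expected, because 314.9 MiB is itself a rounded value.

```python
# Figures copied from the dataset statistics above.
TOTAL_SIZE_MIB = 314.9
N_OBSERVATIONS = 459_347

BYTES_PER_MIB = 1024 * 1024  # assumption: the report uses binary mebibytes


def average_record_size_bytes(total_mib: float, n_rows: int) -> float:
    """Return the mean in-memory size of one row, in bytes."""
    return total_mib * BYTES_PER_MIB / n_rows


avg = average_record_size_bytes(TOTAL_SIZE_MIB, N_OBSERVATIONS)
print(f"{avg:.1f} B per record")
```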

Variable types

Numeric: 6
Text: 5
Categorical: 5
DateTime: 1

Alerts

mmr is highly overall correlated with odometer and 2 other fields [High correlation]
odometer is highly overall correlated with mmr and 2 other fields [High correlation]
sellingprice is highly overall correlated with mmr and 2 other fields [High correlation]
year is highly overall correlated with mmr and 2 other fields [High correlation]
body is highly imbalanced (52.2%) [Imbalance]
transmission is highly imbalanced (78.5%) [Imbalance]
interior is highly imbalanced (51.4%) [Imbalance]
Unnamed: 0 has unique values [Unique]
vin has unique values [Unique]
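
The imbalance percentages in these alerts can be read as a normalised-entropy score: 1 minus the Shannon entropy of the class frequencies, divided by the maximum entropy possible for that number of classes. The sketch below assumes that definition. It is an assumption about how the score is computed, although it does reproduce the 78.5% shown for transmission when fed the two transmission counts listed later in this report.

```python
import math


def imbalance_score(counts: list[int]) -> float:
    """1 - H(p) / log2(k): 0 for a uniform column, near 1 when one class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2 or total == 0:
        return 0.0  # a single class has no meaningful imbalance under this definition
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Counts for the transmission column: AUTOMATIC, MANUAL.
transmission_counts = [443_636, 15_711]
score = imbalance_score(transmission_counts)
print(f"transmission imbalance: {score:.1%}")
```

A perfectly balanced two-class column scores 0 under this definition, so a value near 0.8 signals that almost all rows fall in one class.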

Reproduction

Analysis started: 2024-11-14 01:41:09.016466
Analysis finished: 2024-11-14 01:41:22.315089
Duration: 13.3 seconds
Software version: ydata-profiling v0.0.dev0
Download configuration: config.json
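
A report of this kind is typically produced by passing a pandas DataFrame to ydata-profiling. The sketch below is a minimal outline, not the configuration that produced this report. The CSV input, the file names, and the report title are all assumptions; the `Unnamed: 0` column is merely what pandas names an unlabeled index column when reading a CSV, which hints at that input format.

```python
def build_profile_report(csv_path: str, output_html: str = "report.html") -> None:
    """Profile a CSV file and write the HTML report to disk."""
    # Imported lazily so this module loads even without the optional dependencies.
    import pandas as pd
    from ydata_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    # The title is a hypothetical placeholder.
    report = ProfileReport(df, title="Profiling report")
    report.to_file(output_html)


# Example call (hypothetical file name; the report does not state its input):
# build_profile_report("car_prices.csv")
```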

Variables

Unnamed: 0
Real number (ℝ)

UNIQUE 

Distinct: 459347
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 284634.94
Minimum: 0
Maximum: 558836
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 7.0 MiB

Quantile statistics

Minimum: 0
5-th percentile: 37951.6
Q1: 148497.5
Median: 284795
Q3: 421203.5
95-th percentile: 531202.7
Maximum: 558836
Range: 558836
Interquartile range (IQR): 272706

Descriptive statistics

Standard deviation: 158181.92
Coefficient of variation (CV): 0.55573613
Kurtosis: -1.1767618
Mean: 284634.94
Median Absolute Deviation (MAD): 136354
Skewness: -0.013047283
Sum: 1.3074621 × 10^11
Variance: 2.502152 × 10^10
Monotonicity: Strictly increasing
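
Several of these figures are derived from others, which gives a cheap way to cross-check the report. The sketch below recomputes the range, IQR, coefficient of variation, and variance for this column from the reported minimum, maximum, quartiles, mean, and standard deviation. It uses the usual textbook definitions, which is an assumption about how the report computes them.

```python
# Values copied from the "Unnamed: 0" quantile and descriptive statistics.
minimum, maximum = 0, 558_836
q1, q3 = 148_497.5, 421_203.5
mean, std = 284_634.94, 158_181.92

value_range = maximum - minimum   # reported: 558836
iqr = q3 - q1                     # reported: 272706
cv = std / mean                   # reported: 0.55573613
variance = std ** 2               # reported: about 2.502152e10

print(value_range, iqr, round(cv, 6), f"{variance:.6e}")
```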
Histogram with fixed size bins (bins=50)
Value                   Count    Frequency (%)
0                       1        < 0.1%
375652                  1        < 0.1%
375474                  1        < 0.1%
375473                  1        < 0.1%
375472                  1        < 0.1%
375471                  1        < 0.1%
375470                  1        < 0.1%
375469                  1        < 0.1%
375468                  1        < 0.1%
375467                  1        < 0.1%
Other values (459337)   459337   > 99.9%

Value    Count   Frequency (%)
0        1       < 0.1%
1        1       < 0.1%
2        1       < 0.1%
3        1       < 0.1%
4        1       < 0.1%
5        1       < 0.1%
6        1       < 0.1%
7        1       < 0.1%
8        1       < 0.1%
9        1       < 0.1%

Value    Count   Frequency (%)
558836   1       < 0.1%
558835   1       < 0.1%
558834   1       < 0.1%
558833   1       < 0.1%
558831   1       < 0.1%
558828   1       < 0.1%
558827   1       < 0.1%
558826   1       < 0.1%
558825   1       < 0.1%
558824   1       < 0.1%

year
Real number (ℝ)

HIGH CORRELATION 

Distinct: 26
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2010.2225
Minimum: 1990
Maximum: 2015
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.0 MiB

Quantile statistics

Minimum: 1990
5-th percentile: 2003
Q1: 2008
Median: 2012
Q3: 2013
95-th percentile: 2014
Maximum: 2015
Range: 25
Interquartile range (IQR): 5

Descriptive statistics

Standard deviation: 3.8341173
Coefficient of variation (CV): 0.0019073099
Kurtosis: 1.0581366
Mean: 2010.2225
Median Absolute Deviation (MAD): 2
Skewness: -1.215319
Sum: 9.2338968 × 10^8
Variance: 14.700456
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
Value               Count   Frequency (%)
2013                85553   18.6%
2012                85176   18.5%
2014                68540   14.9%
2011                39984   8.7%
2008                25853   5.6%
2007                24351   5.3%
2010                21693   4.7%
2006                20809   4.5%
2009                17178   3.7%
2005                16637   3.6%
Other values (16)   53573   11.7%

Value   Count   Frequency (%)
1990    33      < 0.1%
1991    51      < 0.1%
1992    95      < 0.1%
1993    127     < 0.1%
1994    286     0.1%
1995    477     0.1%
1996    562     0.1%
1997    1030    0.2%
1998    1446    0.3%
1999    2187    0.5%

Value   Count   Frequency (%)
2015    7904    1.7%
2014    68540   14.9%
2013    85553   18.6%
2012    85176   18.5%
2011    39984   8.7%
2010    21693   4.7%
2009    17178   3.7%
2008    25853   5.6%
2007    24351   5.3%
2006    20809   4.5%

make
Text

Distinct: 53
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.1 MiB

Length

Max length: 13
Median length: 11
Mean length: 5.9949951
Min length: 3

Characters and Unicode

Total characters: 2753783
Distinct characters: 27
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
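
To illustrate what those character properties look like in practice, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The three strings are illustrative only and are not rows taken from the dataset. The two-letter codes map onto the category names used in this report (for example `Lu` is Uppercase Letter).

```python
import unicodedata
from collections import Counter

# Illustrative strings only; they are not rows taken from the dataset.
samples = ["KIA", "LAND ROVER", "MERCEDES-BENZ"]

category_counts = Counter(
    unicodedata.category(ch) for text in samples for ch in text
)

# Lu = Uppercase Letter, Zs = Space Separator, Pd = Dash Punctuation
for code, count in category_counts.most_common():
    print(code, count)
```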

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: KIA
2nd row: KIA
3rd row: BMW
4th row: VOLVO
5th row: BMW
Value               Count    Frequency (%)
ford                78858    17.1%
chevrolet           52580    11.4%
nissan              43128    9.4%
toyota              34463    7.5%
dodge               26364    5.7%
honda               24165    5.2%
hyundai             18286    4.0%
bmw                 16880    3.7%
kia                 15541    3.4%
chrysler            14758    3.2%
Other values (45)   135610   29.4%

Most occurring characters

ValueCountFrequency (%)
O 280617
 
10.2%
E 250663
 
9.1%
D 206496
 
7.5%
A 205369
 
7.5%
N 201752
 
7.3%
R 198602
 
7.2%
I 175681
 
6.4%
S 153970
 
5.6%
T 146526
 
5.3%
C 123359
 
4.5%
Other values (17) 810748
29.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2739006
99.5%
Dash Punctuation 13491
 
0.5%
Space Separator 1286
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O 280617
 
10.2%
E 250663
 
9.2%
D 206496
 
7.5%
A 205369
 
7.5%
N 201752
 
7.4%
R 198602
 
7.3%
I 175681
 
6.4%
S 153970
 
5.6%
T 146526
 
5.3%
C 123359
 
4.5%
Other values (15) 795971
29.1%
Dash Punctuation
ValueCountFrequency (%)
- 13491
100.0%
Space Separator
ValueCountFrequency (%)
1286
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2739006
99.5%
Common 14777
 
0.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
O 280617
 
10.2%
E 250663
 
9.2%
D 206496
 
7.5%
A 205369
 
7.5%
N 201752
 
7.4%
R 198602
 
7.3%
I 175681
 
6.4%
S 153970
 
5.6%
T 146526
 
5.3%
C 123359
 
4.5%
Other values (15) 795971
29.1%
Common
ValueCountFrequency (%)
- 13491
91.3%
1286
 
8.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2753783
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O 280617
 
10.2%
E 250663
 
9.1%
D 206496
 
7.5%
A 205369
 
7.5%
N 201752
 
7.3%
R 198602
 
7.2%
I 175681
 
6.4%
S 153970
 
5.6%
T 146526
 
5.3%
C 123359
 
4.5%
Other values (17) 810748
29.4%

model
Text

Distinct: 764
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 31.4 MiB

Length

Max length: 29
Median length: 23
Mean length: 6.7515647
Min length: 1

Characters and Unicode

Total characters: 3101311
Distinct characters: 39
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 28
Unique (%): < 0.1%

Sample

1st row: SORENTO
2nd row: SORENTO
3rd row: 3 SERIES
4th row: S60
5th row: 6 SERIES GRAN COUPE
Value                Count    Frequency (%)
altima               16192    2.9%
series               12888    2.3%
fusion               12516    2.2%
grand                12226    2.2%
1500                 12169    2.2%
camry                11651    2.1%
f-150                11561    2.1%
escape               10542    1.9%
focus                9403     1.7%
g                    8594     1.5%
Other values (669)   442662   79.0%

Most occurring characters

ValueCountFrequency (%)
A 358099
 
11.5%
E 272660
 
8.8%
R 255538
 
8.2%
S 212136
 
6.8%
O 174142
 
5.6%
C 170126
 
5.5%
I 156903
 
5.1%
N 156716
 
5.1%
T 143788
 
4.6%
L 128808
 
4.2%
Other values (29) 1072395
34.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2751419
88.7%
Decimal Number 211659
 
6.8%
Space Separator 101057
 
3.3%
Dash Punctuation 37082
 
1.2%
Other Punctuation 94
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 358099
13.0%
E 272660
 
9.9%
R 255538
 
9.3%
S 212136
 
7.7%
O 174142
 
6.3%
C 170126
 
6.2%
I 156903
 
5.7%
N 156716
 
5.7%
T 143788
 
5.2%
L 128808
 
4.7%
Other values (16) 722503
26.3%
Decimal Number
ValueCountFrequency (%)
0 78170
36.9%
5 48314
22.8%
3 25898
 
12.2%
1 24973
 
11.8%
2 11968
 
5.7%
4 9242
 
4.4%
6 7213
 
3.4%
7 3218
 
1.5%
9 1908
 
0.9%
8 755
 
0.4%
Space Separator
ValueCountFrequency (%)
101057
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 37082
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 94
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2751419
88.7%
Common 349892
 
11.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 358099
13.0%
E 272660
 
9.9%
R 255538
 
9.3%
S 212136
 
7.7%
O 174142
 
6.3%
C 170126
 
6.2%
I 156903
 
5.7%
N 156716
 
5.7%
T 143788
 
5.2%
L 128808
 
4.7%
Other values (16) 722503
26.3%
Common
ValueCountFrequency (%)
101057
28.9%
0 78170
22.3%
5 48314
13.8%
- 37082
 
10.6%
3 25898
 
7.4%
1 24973
 
7.1%
2 11968
 
3.4%
4 9242
 
2.6%
6 7213
 
2.1%
7 3218
 
0.9%
Other values (3) 2757
 
0.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3101311
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 358099
 
11.5%
E 272660
 
8.8%
R 255538
 
8.2%
S 212136
 
6.8%
O 174142
 
5.6%
C 170126
 
5.5%
I 156903
 
5.1%
N 156716
 
5.1%
T 143788
 
4.6%
L 128808
 
4.2%
Other values (29) 1072395
34.6%

trim
Text

Distinct: 1474
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 30.5 MiB

Length

Max length: 46
Median length: 37
Mean length: 4.6401021
Min length: 1

Characters and Unicode

Total characters: 2131417
Distinct characters: 46
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 107
Unique (%): < 0.1%

Sample

1st row: LX
2nd row: LX
3rd row: 328I SULEV
4th row: T5
5th row: 650I
Value                Count    Frequency (%)
base                 47437    8.5%
se                   42514    7.6%
s                    24910    4.5%
lx                   18888    3.4%
lt                   17804    3.2%
limited              16840    3.0%
2.5                  15720    2.8%
ls                   15529    2.8%
xlt                  14744    2.6%
sport                14573    2.6%
Other values (840)   328638   58.9%

Most occurring characters

ValueCountFrequency (%)
E 236801
 
11.1%
S 236432
 
11.1%
L 200284
 
9.4%
T 165312
 
7.8%
I 116773
 
5.5%
98250
 
4.6%
A 96253
 
4.5%
X 90572
 
4.2%
R 89134
 
4.2%
5 59231
 
2.8%
Other values (36) 742375
34.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1728110
81.1%
Decimal Number 244078
 
11.5%
Space Separator 98250
 
4.6%
Other Punctuation 42376
 
2.0%
Dash Punctuation 16907
 
0.8%
Math Symbol 1592
 
0.1%
Open Punctuation 52
 
< 0.1%
Close Punctuation 52
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 236801
13.7%
S 236432
13.7%
L 200284
11.6%
T 165312
9.6%
I 116773
 
6.8%
A 96253
 
5.6%
X 90572
 
5.2%
R 89134
 
5.2%
U 53561
 
3.1%
B 52521
 
3.0%
Other values (16) 390467
22.6%
Decimal Number
ValueCountFrequency (%)
5 59231
24.3%
2 46618
19.1%
3 41462
17.0%
0 37815
15.5%
1 15093
 
6.2%
7 11646
 
4.8%
8 11459
 
4.7%
6 10651
 
4.4%
4 9848
 
4.0%
9 255
 
0.1%
Other Punctuation
ValueCountFrequency (%)
. 39882
94.1%
/ 2077
 
4.9%
! 383
 
0.9%
' 28
 
0.1%
: 6
 
< 0.1%
Space Separator
ValueCountFrequency (%)
98250
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 16907
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1592
100.0%
Open Punctuation
ValueCountFrequency (%)
( 52
100.0%
Close Punctuation
ValueCountFrequency (%)
) 52
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1728110
81.1%
Common 403307
 
18.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 236801
13.7%
S 236432
13.7%
L 200284
11.6%
T 165312
9.6%
I 116773
 
6.8%
A 96253
 
5.6%
X 90572
 
5.2%
R 89134
 
5.2%
U 53561
 
3.1%
B 52521
 
3.0%
Other values (16) 390467
22.6%
Common
ValueCountFrequency (%)
98250
24.4%
5 59231
14.7%
2 46618
11.6%
3 41462
10.3%
. 39882
9.9%
0 37815
 
9.4%
- 16907
 
4.2%
1 15093
 
3.7%
7 11646
 
2.9%
8 11459
 
2.8%
Other values (10) 24944
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2131417
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 236801
 
11.1%
S 236432
 
11.1%
L 200284
 
9.4%
T 165312
 
7.8%
I 116773
 
5.5%
98250
 
4.6%
A 96253
 
4.5%
X 90572
 
4.2%
R 89134
 
4.2%
5 59231
 
2.8%
Other values (36) 742375
34.8%

body
Categorical

IMBALANCE 

Distinct: 45
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 30.8 MiB
SEDAN: 206429
SUV: 117229
HATCHBACK: 23184
MINIVAN: 21429
COUPE: 15357
Other values (40): 75719

Length

Max length: 23
Median length: 5
Mean length: 5.2971174
Min length: 3

Characters and Unicode

Total characters: 2433215
Distinct characters: 29
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: SUV
2nd row: SUV
3rd row: SEDAN
4th row: SEDAN
5th row: SEDAN

Common Values

Value               Count    Frequency (%)
SEDAN               206429   44.9%
SUV                 117229   25.5%
HATCHBACK           23184    5.0%
MINIVAN             21429    4.7%
COUPE               15357    3.3%
WAGON               13816    3.0%
CREW CAB            13627    3.0%
CONVERTIBLE         8935     1.9%
SUPERCREW           7278     1.6%
G SEDAN             6812     1.5%
Other values (35)   25251    5.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
sedan 213241
42.9%
suv 117229
23.6%
cab 27572
 
5.5%
hatchback 23184
 
4.7%
minivan 21429
 
4.3%
coupe 17356
 
3.5%
wagon 13856
 
2.8%
crew 13627
 
2.7%
convertible 9357
 
1.9%
g 8594
 
1.7%
Other values (32) 32014
 
6.4%

Most occurring characters

ValueCountFrequency (%)
S 345276
14.2%
A 339653
14.0%
E 303283
12.5%
N 288531
11.9%
D 225679
9.3%
U 154964
6.4%
V 152657
6.3%
C 126946
 
5.2%
B 65787
 
2.7%
I 54009
 
2.2%
Other values (19) 376430
15.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2393741
98.4%
Space Separator 38112
 
1.6%
Dash Punctuation 1145
 
< 0.1%
Decimal Number 217
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 345276
14.4%
A 339653
14.2%
E 303283
12.7%
N 288531
12.1%
D 225679
9.4%
U 154964
6.5%
V 152657
6.4%
C 126946
 
5.3%
B 65787
 
2.7%
I 54009
 
2.3%
Other values (12) 336956
14.1%
Decimal Number
ValueCountFrequency (%)
6 76
35.0%
0 76
35.0%
3 30
 
13.8%
7 30
 
13.8%
4 5
 
2.3%
Space Separator
ValueCountFrequency (%)
38112
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1145
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2393741
98.4%
Common 39474
 
1.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 345276
14.4%
A 339653
14.2%
E 303283
12.7%
N 288531
12.1%
D 225679
9.4%
U 154964
6.5%
V 152657
6.4%
C 126946
 
5.3%
B 65787
 
2.7%
I 54009
 
2.3%
Other values (12) 336956
14.1%
Common
ValueCountFrequency (%)
38112
96.5%
- 1145
 
2.9%
6 76
 
0.2%
0 76
 
0.2%
3 30
 
0.1%
7 30
 
0.1%
4 5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2433215
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
S 345276
14.2%
A 339653
14.0%
E 303283
12.5%
N 288531
11.9%
D 225679
9.3%
U 154964
6.4%
V 152657
6.3%
C 126946
 
5.2%
B 65787
 
2.7%
I 54009
 
2.2%
Other values (19) 376430
15.5%

transmission
Categorical

IMBALANCE 

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 32.4 MiB
AUTOMATIC: 443636
MANUAL: 15711

Length

Max length: 9
Median length: 9
Mean length: 8.8973913
Min length: 6

Characters and Unicode

Total characters: 4086990
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: AUTOMATIC
2nd row: AUTOMATIC
3rd row: AUTOMATIC
4th row: AUTOMATIC
5th row: AUTOMATIC

Common Values

Value       Count    Frequency (%)
AUTOMATIC   443636   96.6%
MANUAL      15711    3.4%
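
Because this column has only two values, its length statistics can be reproduced by hand from the counts above. The sketch below recomputes the total character count and the mean length as a count-weighted sum of label lengths. It is a cross-check on the reported 4086990 characters and 8.8973913 mean length, not the report's own code.

```python
# Category counts from the common values listed above for transmission.
counts = {"AUTOMATIC": 443_636, "MANUAL": 15_711}

total_rows = sum(counts.values())
total_characters = sum(len(label) * n for label, n in counts.items())
mean_length = total_characters / total_rows

print(total_rows, total_characters, round(mean_length, 7))
```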

Length

Histogram of lengths of the category

Value       Count    Frequency (%)
automatic   443636   96.6%
manual      15711    3.4%

Most occurring characters

ValueCountFrequency (%)
A 918694
22.5%
T 887272
21.7%
U 459347
11.2%
M 459347
11.2%
O 443636
10.9%
I 443636
10.9%
C 443636
10.9%
N 15711
 
0.4%
L 15711
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 4086990
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 918694
22.5%
T 887272
21.7%
U 459347
11.2%
M 459347
11.2%
O 443636
10.9%
I 443636
10.9%
C 443636
10.9%
N 15711
 
0.4%
L 15711
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 4086990
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 918694
22.5%
T 887272
21.7%
U 459347
11.2%
M 459347
11.2%
O 443636
10.9%
I 443636
10.9%
C 443636
10.9%
N 15711
 
0.4%
L 15711
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4086990
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 918694
22.5%
T 887272
21.7%
U 459347
11.2%
M 459347
11.2%
O 443636
10.9%
I 443636
10.9%
C 443636
10.9%
N 15711
 
0.4%
L 15711
 
0.4%

vin
Text

UNIQUE 

Distinct: 459347
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 35.9 MiB

Length

Max length: 17
Median length: 17
Mean length: 17
Min length: 17

Characters and Unicode

Total characters: 7808899
Distinct characters: 33
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 459347
Unique (%): 100.0%

Sample

1st row: 5xyktca69fg566472
2nd row: 5xyktca69fg561319
3rd row: wba3c1c51ek116351
4th row: yv1612tb4f1310987
5th row: wba6b2c57ed129731
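
Every value in this column is 17 characters long and the column uses 33 distinct characters. That is consistent with the standard VIN alphabet of ten digits plus 23 letters (I, O, and Q are not used). The sketch below checks that shape for the five sample rows. It validates length and alphabet only, and treating the column as lowercase is an assumption based on these samples.

```python
import re

# 17 characters drawn from digits and letters other than i, o, q
# (lowercase, as the values appear in this column).
VIN_PATTERN = re.compile(r"[0-9a-hj-npr-z]{17}")


def looks_like_vin(value: str) -> bool:
    """Check length and alphabet only; this does not verify the check digit."""
    return VIN_PATTERN.fullmatch(value) is not None


sample_rows = [
    "5xyktca69fg566472",
    "5xyktca69fg561319",
    "wba3c1c51ek116351",
    "yv1612tb4f1310987",
    "wba6b2c57ed129731",
]
print(all(looks_like_vin(v) for v in sample_rows))
```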
Value                   Count    Frequency (%)
5xyktca69fg566472       1        < 0.1%
knagm4ad8d5056639       1        < 0.1%
yv1612tb4f1310987       1        < 0.1%
wba6b2c57ed129731       1        < 0.1%
1n4al3ap1fn326013       1        < 0.1%
wbsfv9c51ed593089       1        < 0.1%
1g1pc5sb2e7128460       1        < 0.1%
wauffafl3en030343       1        < 0.1%
2g1fb3d37e9218789       1        < 0.1%
wauhgafc0en062916       1        < 0.1%
Other values (459337)   459337   > 99.9%

Most occurring characters

ValueCountFrequency (%)
1 757751
 
9.7%
2 523213
 
6.7%
3 504651
 
6.5%
5 489867
 
6.3%
4 468123
 
6.0%
0 409790
 
5.2%
6 399840
 
5.1%
7 376083
 
4.8%
8 370564
 
4.7%
c 317882
 
4.1%
Other values (23) 3191135
40.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4609856
59.0%
Lowercase Letter 3199043
41.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 317882
 
9.9%
a 296915
 
9.3%
d 236327
 
7.4%
f 233161
 
7.3%
b 220594
 
6.9%
e 201666
 
6.3%
g 193397
 
6.0%
n 157669
 
4.9%
k 132047
 
4.1%
h 125435
 
3.9%
Other values (13) 1083950
33.9%
Decimal Number
ValueCountFrequency (%)
1 757751
16.4%
2 523213
11.3%
3 504651
10.9%
5 489867
10.6%
4 468123
10.2%
0 409790
8.9%
6 399840
8.7%
7 376083
8.2%
8 370564
8.0%
9 309974
6.7%

Most occurring scripts

ValueCountFrequency (%)
Common 4609856
59.0%
Latin 3199043
41.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 317882
 
9.9%
a 296915
 
9.3%
d 236327
 
7.4%
f 233161
 
7.3%
b 220594
 
6.9%
e 201666
 
6.3%
g 193397
 
6.0%
n 157669
 
4.9%
k 132047
 
4.1%
h 125435
 
3.9%
Other values (13) 1083950
33.9%
Common
ValueCountFrequency (%)
1 757751
16.4%
2 523213
11.3%
3 504651
10.9%
5 489867
10.6%
4 468123
10.2%
0 409790
8.9%
6 399840
8.7%
7 376083
8.2%
8 370564
8.0%
9 309974
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7808899
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 757751
 
9.7%
2 523213
 
6.7%
3 504651
 
6.5%
5 489867
 
6.3%
4 468123
 
6.0%
0 409790
 
5.2%
6 399840
 
5.1%
7 376083
 
4.8%
8 370564
 
4.7%
c 317882
 
4.1%
Other values (23) 3191135
40.9%

state
Categorical

Distinct: 34
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 29.4 MiB
FL: 73405
CA: 63579
TX: 40393
GA: 29761
PA: 23522
Other values (29): 228687

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 918694
Distinct characters: 24
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: CA
2nd row: CA
3rd row: CA
4th row: CA
5th row: CA

Common Values

Value               Count    Frequency (%)
FL                  73405    16.0%
CA                  63579    13.8%
TX                  40393    8.8%
GA                  29761    6.5%
PA                  23522    5.1%
NJ                  22578    4.9%
IL                  21144    4.6%
OH                  19771    4.3%
TN                  18476    4.0%
NC                  18040    3.9%
Other values (24)   128678   28.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
fl 73405
16.0%
ca 63579
13.8%
tx 40393
 
8.8%
ga 29761
 
6.5%
pa 23522
 
5.1%
nj 22578
 
4.9%
il 21144
 
4.6%
oh 19771
 
4.3%
tn 18476
 
4.0%
nc 18040
 
3.9%
Other values (24) 128678
28.0%

Most occurring characters

ValueCountFrequency (%)
A 148225
16.1%
L 96291
10.5%
C 91505
10.0%
N 91012
9.9%
F 73405
8.0%
T 60550
 
6.6%
M 54874
 
6.0%
I 49027
 
5.3%
O 41983
 
4.6%
X 40393
 
4.4%
Other values (14) 171429
18.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 918694
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 148225
16.1%
L 96291
10.5%
C 91505
10.0%
N 91012
9.9%
F 73405
8.0%
T 60550
 
6.6%
M 54874
 
6.0%
I 49027
 
5.3%
O 41983
 
4.6%
X 40393
 
4.4%
Other values (14) 171429
18.7%

Most occurring scripts

ValueCountFrequency (%)
Latin 918694
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 148225
16.1%
L 96291
10.5%
C 91505
10.0%
N 91012
9.9%
F 73405
8.0%
T 60550
 
6.6%
M 54874
 
6.0%
I 49027
 
5.3%
O 41983
 
4.6%
X 40393
 
4.4%
Other values (14) 171429
18.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 918694
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 148225
16.1%
L 96291
10.5%
C 91505
10.0%
N 91012
9.9%
F 73405
8.0%
T 60550
 
6.6%
M 54874
 
6.0%
I 49027
 
5.3%
O 41983
 
4.6%
X 40393
 
4.4%
Other values (14) 171429
18.7%

condition
Real number (ℝ)

Distinct: 41
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 30.787189
Minimum: 1
Maximum: 49
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 24
Median: 35
Q3: 42
95-th percentile: 47
Maximum: 49
Range: 48
Interquartile range (IQR): 18

Descriptive statistics

Standard deviation: 13.307967
Coefficient of variation (CV): 0.43225663
Kurtosis: -0.18702975
Mean: 30.787189
Median Absolute Deviation (MAD): 8
Skewness: -0.8428269
Sum: 14142003
Variance: 177.10198
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=41)
Value               Count    Frequency (%)
19                  35669    7.8%
35                  22454    4.9%
37                  21986    4.8%
44                  21579    4.7%
43                  21095    4.6%
42                  20604    4.5%
36                  19571    4.3%
41                  19305    4.2%
39                  17119    3.7%
2                   16741    3.6%
Other values (31)   243224   52.9%

Value   Count   Frequency (%)
1       5688    1.2%
2       16741   3.6%
3       8906    1.9%
4       16737   3.6%
5       9230    2.0%
11      77      < 0.1%
12      82      < 0.1%
13      71      < 0.1%
14      111     < 0.1%
15      113     < 0.1%

Value   Count   Frequency (%)
49      10850   2.4%
48      10683   2.3%
47      9543    2.1%
46      10594   2.3%
45      10440   2.3%
44      21579   4.7%
43      21095   4.6%
42      20604   4.5%
41      19305   4.2%
39      17119   3.7%

odometer
Real number (ℝ)

HIGH CORRELATION 

Distinct: 159073
Distinct (%): 34.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 66644.169
Minimum: 1
Maximum: 999999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 10397
Q1: 27947.5
Median: 50683
Q3: 96674
95-th percentile: 166075.8
Maximum: 999999
Range: 999998
Interquartile range (IQR): 68726.5

Descriptive statistics

Standard deviation: 52181.19
Coefficient of variation (CV): 0.78298207
Kurtosis: 15.151132
Mean: 66644.169
Median Absolute Deviation (MAD): 29180
Skewness: 1.9277255
Sum: 3.0612799 × 10^10
Variance: 2.7228766 × 10^9
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value                   Count    Frequency (%)
1                       845      0.2%
999999                  60       < 0.1%
10                      24       < 0.1%
24023                   17       < 0.1%
33995                   17       < 0.1%
21587                   17       < 0.1%
35888                   16       < 0.1%
21310                   16       < 0.1%
29850                   16       < 0.1%
36265                   16       < 0.1%
Other values (159063)   458303   99.8%

Value   Count   Frequency (%)
1       845     0.2%
2       10      < 0.1%
3       5       < 0.1%
4       7       < 0.1%
5       12      < 0.1%
6       12      < 0.1%
7       12      < 0.1%
8       15      < 0.1%
9       9       < 0.1%
10      24      < 0.1%

Value    Count   Frequency (%)
999999   60      < 0.1%
980113   1       < 0.1%
959276   1       < 0.1%
694978   2       < 0.1%
621388   1       < 0.1%
537334   1       < 0.1%
522212   1       < 0.1%
495757   1       < 0.1%
480747   1       < 0.1%
471114   1       < 0.1%
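
The two most frequent odometer values are also the two extremes: 1 (845 rows) and 999999 (60 rows). Counts like these often indicate placeholder entries rather than real readings, although the report itself does not say so. The sketch below shows one way to separate such values before computing statistics. The choice of suspect values is an assumption based only on the counts above, and the readings passed in are hypothetical.

```python
# Assumption: these two values are placeholders, judged only by their unusual counts.
SUSPECT_ODOMETER_VALUES = {1, 999_999}


def split_suspect_readings(readings: list[int]) -> tuple[list[int], list[int]]:
    """Return (plausible, suspect) odometer readings, preserving order."""
    plausible = [r for r in readings if r not in SUSPECT_ODOMETER_VALUES]
    suspect = [r for r in readings if r in SUSPECT_ODOMETER_VALUES]
    return plausible, suspect


# Hypothetical readings for illustration.
plausible, suspect = split_suspect_readings([1, 24_023, 999_999, 50_683, 1])
print(len(plausible), len(suspect))
```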

color
Categorical

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 31.1 MiB
BLACK: 90446
WHITE: 86860
SILVER: 69167
GRAY: 68734
BLUE: 41985
Other values (15): 102155

Length

Max length: 9
Median length: 8
Mean length: 4.6142023
Min length: 1

Characters and Unicode

Total characters: 2119520
Distinct characters: 25
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: WHITE
2nd row: WHITE
3rd row: GRAY
4th row: WHITE
5th row: GRAY

Common Values

Value               Count   Frequency (%)
BLACK               90446   19.7%
WHITE               86860   18.9%
SILVER              69167   15.1%
GRAY                68734   15.0%
BLUE                41985   9.1%
RED                 36278   7.9%
—                   21678   4.7%
GOLD                9266    2.0%
GREEN               9038    2.0%
BURGUNDY            7402    1.6%
Other values (10)   18493   4.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
black 90446
19.7%
white 86860
18.9%
silver 69167
15.1%
gray 68734
15.0%
blue 41985
9.1%
red 36278
7.9%
— 21678
 
4.7%
gold 9266
 
2.0%
green 9038
 
2.0%
burgundy 7402
 
1.6%
Other values (10) 18493
 
4.0%

Most occurring characters

ValueCountFrequency (%)
E 272173
12.8%
L 214561
 
10.1%
R 199605
 
9.4%
I 164698
 
7.8%
A 161704
 
7.6%
B 152539
 
7.2%
G 103368
 
4.9%
W 94546
 
4.5%
C 91280
 
4.3%
K 90485
 
4.3%
Other values (15) 574561
27.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2096645
98.9%
Dash Punctuation 22875
 
1.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E 272173
13.0%
L 214561
 
10.2%
R 199605
 
9.5%
I 164698
 
7.9%
A 161704
 
7.7%
B 152539
 
7.3%
G 103368
 
4.9%
W 94546
 
4.5%
C 91280
 
4.4%
K 90485
 
4.3%
Other values (13) 551686
26.3%
Dash Punctuation
ValueCountFrequency (%)
— 21678
94.8%
- 1197
 
5.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 2096645
98.9%
Common 22875
 
1.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
E 272173
13.0%
L 214561
 
10.2%
R 199605
 
9.5%
I 164698
 
7.9%
A 161704
 
7.7%
B 152539
 
7.3%
G 103368
 
4.9%
W 94546
 
4.5%
C 91280
 
4.4%
K 90485
 
4.3%
Other values (13) 551686
26.3%
Common
ValueCountFrequency (%)
— 21678
94.8%
- 1197
 
5.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2097842
99.0%
Punctuation 21678
 
1.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E 272173
13.0%
L 214561
 
10.2%
R 199605
 
9.5%
I 164698
 
7.9%
A 161704
 
7.7%
B 152539
 
7.3%
G 103368
 
4.9%
W 94546
 
4.5%
C 91280
 
4.4%
K 90485
 
4.3%
Other values (14) 552883
26.4%
Punctuation
ValueCountFrequency (%)
— 21678
100.0%
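
This column reports two Unicode blocks because 21678 values consist of a long dash that lies outside ASCII, while the hyphen inside other values is plain ASCII. Both characters fall in the Dash Punctuation category, which is why the category table groups them. The sketch below inspects the two characters with the standard `unicodedata` module, assuming the long dash is U+2014 as it appears in this extract.

```python
import unicodedata

EM_DASH = "\u2014"      # the dash-only value as it appears in this extract
HYPHEN_MINUS = "-"      # the hyphen used inside other values


def describe(ch: str) -> tuple[str, str, bool]:
    """Return (Unicode name, general category, is_ascii) for one character."""
    return unicodedata.name(ch), unicodedata.category(ch), ord(ch) < 128


print(describe(EM_DASH))
print(describe(HYPHEN_MINUS))
```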

interior
Categorical

IMBALANCE 

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 30.7 MiB
BLACK: 204136
GRAY: 148477
BEIGE: 49038
TAN: 36652
—: 9646
Other values (12): 11398

Length

Max length: 9
Median length: 5
Mean length: 4.4329037
Min length: 1

Characters and Unicode

Total characters: 2036241
Distinct characters: 23
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: BLACK
2nd row: BEIGE
3rd row: BLACK
4th row: BLACK
5th row: BLACK

Common Values

Value              Count    Frequency (%)
BLACK              204136   44.4%
GRAY               148477   32.3%
BEIGE              49038    10.7%
TAN                36652    8.0%
—                  9646     2.1%
BROWN              6869     1.5%
RED                1088     0.2%
SILVER             953      0.2%
BLUE               894      0.2%
OFF-WHITE          350      0.1%
Other values (7)   1244     0.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
black 204136
44.4%
gray 148477
32.3%
beige 49038
 
10.7%
tan 36652
 
8.0%
— 9646
 
2.1%
brown 6869
 
1.5%
red 1088
 
0.2%
silver 953
 
0.2%
blue 894
 
0.2%
off-white 350
 
0.1%
Other values (7) 1244
 
0.3%

Most occurring characters

ValueCountFrequency (%)
A 389382
19.1%
B 261088
12.8%
L 206574
10.1%
C 204136
10.0%
K 204136
10.0%
G 198258
9.7%
R 158129
7.8%
Y 148647
 
7.3%
E 102375
 
5.0%
I 50547
 
2.5%
Other values (13) 112969
 
5.5%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2026245
99.5%
Dash Punctuation 9996
 
0.5%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 389382
19.2%
B 261088
12.9%
L 206574
10.2%
C 204136
10.1%
K 204136
10.1%
G 198258
9.8%
R 158129
7.8%
Y 148647
 
7.3%
E 102375
 
5.1%
I 50547
 
2.5%
Other values (11) 102973
 
5.1%
Dash Punctuation
ValueCountFrequency (%)
— 9646
96.5%
- 350
 
3.5%

Most occurring scripts

ValueCountFrequency (%)
Latin 2026245
99.5%
Common 9996
 
0.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 389382
19.2%
B 261088
12.9%
L 206574
10.2%
C 204136
10.1%
K 204136
10.1%
G 198258
9.8%
R 158129
7.8%
Y 148647
 
7.3%
E 102375
 
5.1%
I 50547
 
2.5%
Other values (11) 102973
 
5.1%
Common
ValueCountFrequency (%)
— 9646
96.5%
- 350
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2026595
99.5%
Punctuation 9646
 
0.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 389382
19.2%
B 261088
12.9%
L 206574
10.2%
C 204136
10.1%
K 204136
10.1%
G 198258
9.8%
R 158129
7.8%
Y 148647
 
7.3%
E 102375
 
5.1%
I 50547
 
2.5%
Other values (12) 103323
 
5.1%
Punctuation
ValueCountFrequency (%)
— 9646
100.0%

seller
Text

Distinct: 11714
Distinct (%): 2.6%
Missing: 0
Missing (%): 0.0%
Memory size: 38.6 MiB

Length

Max length: 50
Median length: 42
Mean length: 23.058749
Min length: 3

Characters and Unicode

Total characters: 10591967
Distinct characters: 47
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4076
Unique (%): 0.9%

Sample

1st row: KIA MOTORS AMERICA INC
2nd row: KIA MOTORS AMERICA INC
3rd row: FINANCIAL SERVICES REMARKETING (LEASE)
4th row: VOLVO NA REP/WORLD OMNI
5th row: FINANCIAL SERVICES REMARKETING (LEASE)
Value                 Count     Frequency (%)
inc                   64357     4.2%
corporation           42443     2.7%
credit                41755     2.7%
services              41236     2.7%
motor                 39756     2.6%
llc                   39159     2.5%
financial             37352     2.4%
auto                  34572     2.2%
ford                  31378     2.0%
remarketing           28461     1.8%
Other values (7362)   1143076   74.1%

Most occurring characters

ValueCountFrequency (%)
1099827
 
10.4%
E 947144
 
8.9%
A 857349
 
8.1%
R 804906
 
7.6%
N 783153
 
7.4%
I 758418
 
7.2%
O 719484
 
6.8%
T 662700
 
6.3%
C 608673
 
5.7%
S 546262
 
5.2%
Other values (37) 2804051
26.5%

Most occurring categories

Value | Count | Frequency (%)
Uppercase Letter | 9321535 | 88.0%
Space Separator | 1099827 | 10.4%
Other Punctuation | 124312 | 1.2%
Dash Punctuation | 28295 | 0.3%
Decimal Number | 7351 | 0.1%
Close Punctuation | 5319 | 0.1%
Open Punctuation | 5319 | 0.1%
Math Symbol | 9 | < 0.1%

Most frequent character per category

Uppercase Letter
Value | Count | Frequency (%)
E | 947144 | 10.2%
A | 857349 | 9.2%
R | 804906 | 8.6%
N | 783153 | 8.4%
I | 758418 | 8.1%
O | 719484 | 7.7%
T | 662700 | 7.1%
C | 608673 | 6.5%
S | 546262 | 5.9%
L | 469207 | 5.0%
Other values (16) | 2164239 | 23.2%

Decimal Number
Value | Count | Frequency (%)
2 | 2326 | 31.6%
1 | 1566 | 21.3%
0 | 1033 | 14.1%
9 | 501 | 6.8%
5 | 494 | 6.7%
8 | 444 | 6.0%
4 | 371 | 5.0%
3 | 319 | 4.3%
6 | 225 | 3.1%
7 | 72 | 1.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 89404 | 71.9%
. | 23588 | 19.0%
& | 7413 | 6.0%
' | 2248 | 1.8%
# | 1658 | 1.3%
: | 1 | < 0.1%

Space Separator
Value | Count | Frequency (%)
(space) | 1099827 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 28295 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 5319 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 5319 | 100.0%

Math Symbol
Value | Count | Frequency (%)
+ | 9 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 9321535 | 88.0%
Common | 1270432 | 12.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
E | 947144 | 10.2%
A | 857349 | 9.2%
R | 804906 | 8.6%
N | 783153 | 8.4%
I | 758418 | 8.1%
O | 719484 | 7.7%
T | 662700 | 7.1%
C | 608673 | 6.5%
S | 546262 | 5.9%
L | 469207 | 5.0%
Other values (16) | 2164239 | 23.2%

Common
Value | Count | Frequency (%)
(space) | 1099827 | 86.6%
/ | 89404 | 7.0%
- | 28295 | 2.2%
. | 23588 | 1.9%
& | 7413 | 0.6%
) | 5319 | 0.4%
( | 5319 | 0.4%
2 | 2326 | 0.2%
' | 2248 | 0.2%
# | 1658 | 0.1%
Other values (11) | 5035 | 0.4%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 10591967 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1099827 | 10.4%
E | 947144 | 8.9%
A | 857349 | 8.1%
R | 804906 | 7.6%
N | 783153 | 7.4%
I | 758418 | 7.2%
O | 719484 | 6.8%
T | 662700 | 6.3%
C | 608673 | 5.7%
S | 546262 | 5.2%
Other values (37) | 2804051 | 26.5%

mmr
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1098
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13829.019
Minimum: 25
Maximum: 182000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.0 MiB

Quantile statistics

Minimum: 25
5-th percentile: 1950
Q1: 7425
median: 12300
Q3: 18250
95-th percentile: 30500
Maximum: 182000
Range: 181975
Interquartile range (IQR): 10825
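
The range and interquartile range follow directly from the other quantile figures. The sketch below shows one way to compute such a summary with Python's standard `statistics` module, then cross-checks the two derived values against the mmr numbers reported here. The helper `quartile_summary` and the toy input are illustrative, and the `"inclusive"` interpolation method is an assumption, so results on real data may differ slightly from the report's quantile convention.

```python
import statistics

def quartile_summary(values):
    """Return min, quartiles, max, range and IQR for a sequence of numbers."""
    ordered = sorted(values)
    # method="inclusive" interpolates linearly between order statistics,
    # treating the data as the full population rather than a sample.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
        "range": ordered[-1] - ordered[0],
        "iqr": q3 - q1,
    }

# Toy data, only to show the function's output shape.
toy = quartile_summary([1, 2, 3, 4, 5])

# Cross-check the derived figures using the mmr values from this report.
mmr = {"min": 25, "q1": 7425, "q3": 18250, "max": 182000}
mmr_range = mmr["max"] - mmr["min"]  # 182000 - 25
mmr_iqr = mmr["q3"] - mmr["q1"]      # 18250 - 7425
print(mmr_range, mmr_iqr)
```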

Descriptive statistics

Standard deviation: 9541.0782
Coefficient of variation (CV): 0.68993168
Kurtosis: 12.352817
Mean: 13829.019
Median Absolute Deviation (MAD): 5400
Skewness: 2.0512269
Sum: 6.3523183 × 10^9
Variance: 91032172
Monotonicity: Not monotonic
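
Several of these statistics are tied together by simple identities: the coefficient of variation is the standard deviation divided by the mean, the variance is the squared standard deviation, and the sum is the mean times the number of observations. The sketch below checks those identities against the mmr figures, taking the observation count (459347) from the report overview. The `mad` helper is an illustrative implementation of the median absolute deviation, not necessarily the routine the report uses.

```python
import statistics

def mad(values):
    """Median absolute deviation: median of |x - median(x)|."""
    values = list(values)
    center = statistics.median(values)
    return statistics.median(abs(v - center) for v in values)

# Figures copied from this report (mmr variable and dataset overview).
n_rows = 459347
mean = 13829.019
std = 9541.0782

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
total = mean * n_rows  # sum of all values

print(round(cv, 8), round(variance), f"{total:.7e}")

# Toy data, only to show how MAD resists a single extreme value.
toy_mad = mad([1, 2, 3, 4, 100])
```

Because the reported mean and standard deviation are rounded, the recomputed values agree with the report only up to rounding in the last digits.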
Histogram with fixed size bins (bins=50)
Common values
Value | Count | Frequency (%)
11650 | 1505 | 0.3%
12500 | 1487 | 0.3%
11600 | 1486 | 0.3%
11750 | 1478 | 0.3%
11850 | 1476 | 0.3%
11300 | 1473 | 0.3%
12700 | 1455 | 0.3%
12050 | 1453 | 0.3%
11050 | 1450 | 0.3%
12350 | 1448 | 0.3%
Other values (1088) | 444636 | 96.8%

Minimum 10 values
Value | Count | Frequency (%)
25 | 16 | < 0.1%
50 | 38 | < 0.1%
75 | 18 | < 0.1%
100 | 25 | < 0.1%
125 | 21 | < 0.1%
150 | 32 | < 0.1%
175 | 44 | < 0.1%
200 | 40 | < 0.1%
225 | 35 | < 0.1%
250 | 61 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
182000 | 1 | < 0.1%
178000 | 1 | < 0.1%
176000 | 1 | < 0.1%
170000 | 3 | < 0.1%
166000 | 2 | < 0.1%
164000 | 1 | < 0.1%
163000 | 1 | < 0.1%
162000 | 1 | < 0.1%
161000 | 1 | < 0.1%
160000 | 2 | < 0.1%

sellingprice
Real number (ℝ)

HIGH CORRELATION 

Distinct: 1785
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 13687.786
Minimum: 1
Maximum: 230000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 7.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1600
Q1: 7200
median: 12200
Q3: 18200
95-th percentile: 30500
Maximum: 230000
Range: 229999
Interquartile range (IQR): 11000

Descriptive statistics

Standard deviation: 9620.281
Coefficient of variation (CV): 0.70283691
Kurtosis: 12.080594
Mean: 13687.786
Median Absolute Deviation (MAD): 5500
Skewness: 2.0054681
Sum: 6.2874433 × 10^9
Variance: 92549807
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
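
A fixed-size-bin histogram splits the interval between the minimum and maximum into equally wide bins and counts the values falling in each. The sketch below is a minimal pure-Python version; the function name `fixed_width_histogram` and the four input values (the minimum, two central values and the maximum of sellingprice) are chosen for illustration, and the report's own plotting code may handle bin edges differently.

```python
def fixed_width_histogram(values, bins=50):
    """Count values in `bins` equally wide intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if width == 0:
            idx = 0  # all values identical: everything goes in the first bin
        else:
            # The maximum sits on the right edge, so clamp it into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(bins + 1)]
    return counts, edges

# Minimum, two mid-range values and maximum of sellingprice from this report.
counts, edges = fixed_width_histogram([1, 12000, 12200, 230000], bins=50)
print(edges[1] - edges[0])  # width of one bin
```

With a range of 229999 and 50 bins, each bin is roughly 4600 wide, so a strongly right-skewed variable like this one places nearly all observations in the first few bins.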
Common values
Value | Count | Frequency (%)
12000 | 3782 | 0.8%
11000 | 3723 | 0.8%
13000 | 3687 | 0.8%
10000 | 3438 | 0.7%
11500 | 3347 | 0.7%
14000 | 3242 | 0.7%
12500 | 3175 | 0.7%
9000 | 3097 | 0.7%
10500 | 2994 | 0.7%
9500 | 2846 | 0.6%
Other values (1775) | 426016 | 92.7%

Minimum 10 values
Value | Count | Frequency (%)
1 | 1 | < 0.1%
100 | 13 | < 0.1%
150 | 16 | < 0.1%
175 | 7 | < 0.1%
200 | 130 | < 0.1%
225 | 79 | < 0.1%
250 | 211 | < 0.1%
275 | 88 | < 0.1%
300 | 927 | 0.2%
325 | 151 | < 0.1%

Maximum 10 values
Value | Count | Frequency (%)
230000 | 1 | < 0.1%
183000 | 1 | < 0.1%
173000 | 1 | < 0.1%
171500 | 1 | < 0.1%
169500 | 1 | < 0.1%
169000 | 1 | < 0.1%
167000 | 1 | < 0.1%
165000 | 2 | < 0.1%
163000 | 2 | < 0.1%
161000 | 1 | < 0.1%

saledate
DateTime

Distinct: 3585
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 7.0 MiB
Minimum: 2014-01-01 01:15:00
Maximum: 2015-07-20 19:30:00
Histogram with fixed size bins (bins=50)

Interactions

[Figure: pairwise interaction plots, 36 panels]

Correlations

Variable | Unnamed: 0 | body | color | condition | interior | mmr | odometer | sellingprice | state | transmission | year
Unnamed: 0 | 1.000 | 0.019 | 0.013 | 0.027 | 0.013 | 0.036 | -0.020 | 0.042 | 0.073 | 0.012 | 0.067
body | 0.019 | 1.000 | 0.071 | 0.062 | 0.073 | 0.149 | 0.066 | 0.129 | 0.049 | 0.234 | 0.087
color | 0.013 | 0.071 | 1.000 | 0.057 | 0.093 | 0.055 | 0.065 | 0.048 | 0.067 | 0.081 | 0.092
condition | 0.027 | 0.062 | 0.057 | 1.000 | 0.053 | 0.424 | -0.407 | 0.478 | 0.086 | 0.033 | 0.388
interior | 0.013 | 0.073 | 0.093 | 0.053 | 1.000 | 0.061 | 0.085 | 0.061 | 0.059 | 0.077 | 0.104
mmr | 0.036 | 0.149 | 0.055 | 0.424 | 0.061 | 1.000 | -0.713 | 0.980 | 0.061 | 0.029 | 0.686
odometer | -0.020 | 0.066 | 0.065 | -0.407 | 0.085 | -0.713 | 1.000 | -0.701 | 0.088 | 0.032 | -0.812
sellingprice | 0.042 | 0.129 | 0.048 | 0.478 | 0.061 | 0.980 | -0.701 | 1.000 | 0.050 | 0.015 | 0.669
state | 0.073 | 0.049 | 0.067 | 0.086 | 0.059 | 0.061 | 0.088 | 0.050 | 1.000 | 0.078 | 0.096
transmission | 0.012 | 0.234 | 0.081 | 0.033 | 0.077 | 0.029 | 0.032 | 0.015 | 0.078 | 1.000 | 0.092
year | 0.067 | 0.087 | 0.092 | 0.388 | 0.104 | 0.686 | -0.812 | 0.669 | 0.096 | 0.092 | 1.000
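
The "High correlation" alerts in the overview can be related to this matrix by listing, for each variable, the partners whose absolute coefficient exceeds a cut-off. The sketch below does this for the five numeric variables, using coefficients copied from the matrix above. The threshold of 0.5 is an assumption made for illustration, and `high_correlation_partners` is a hypothetical helper rather than part of the profiling library.

```python
THRESHOLD = 0.5  # assumed cut-off for flagging a pair as highly correlated

# Coefficients copied from the correlation matrix (numeric variables only).
PAIRS = {
    ("condition", "mmr"): 0.424,
    ("condition", "odometer"): -0.407,
    ("condition", "sellingprice"): 0.478,
    ("condition", "year"): 0.388,
    ("mmr", "odometer"): -0.713,
    ("mmr", "sellingprice"): 0.980,
    ("mmr", "year"): 0.686,
    ("odometer", "sellingprice"): -0.701,
    ("odometer", "year"): -0.812,
    ("sellingprice", "year"): 0.669,
}

def high_correlation_partners(pairs, threshold=THRESHOLD):
    """Map each variable to the sorted partners with |r| >= threshold."""
    partners = {}
    for (a, b), r in pairs.items():
        if abs(r) >= threshold:
            partners.setdefault(a, []).append(b)
            partners.setdefault(b, []).append(a)
    return {name: sorted(others) for name, others in partners.items()}

partners = high_correlation_partners(PAIRS)
for name in sorted(partners):
    print(name, "->", ", ".join(partners[name]))
```

Under this assumed threshold, mmr, odometer, sellingprice and year each have three flagged partners, which is consistent with the alerts listing one field "and 2 other fields" for each of them, while condition stays below the cut-off.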

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
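
To make the two displays concrete, the sketch below builds a boolean nullity matrix (one row per record, one flag per column) and the per-column missing counts for a handful of toy records. The helper `nullity_summary` and the records are hypothetical: the gaps are invented for illustration, since this dataset reports zero missing cells, and `None` stands in for whatever missing-value marker the real data uses.

```python
def nullity_summary(rows, columns):
    """Return (matrix, missing) where matrix[i][j] is True if row i has a value in column j."""
    matrix = [[row.get(col) is not None for col in columns] for row in rows]
    missing = {
        col: sum(1 for flags in matrix if not flags[j])
        for j, col in enumerate(columns)
    }
    return matrix, missing

columns = ["make", "odometer", "color"]
rows = [
    {"make": "KIA", "odometer": 16639.0, "color": "WHITE"},
    {"make": "BMW", "odometer": None, "color": "GRAY"},      # hypothetical gap
    {"make": "VOLVO", "odometer": 14282.0, "color": None},   # hypothetical gap
]

matrix, missing = nullity_summary(rows, columns)
print(missing)
```

The per-column counts correspond to the bar-style view of nullity by column, and the boolean matrix corresponds to the nullity matrix, where gaps show up as breaks in otherwise solid columns.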

Sample

First rows

(index) | Unnamed: 0 | year | make | model | trim | body | transmission | vin | state | condition | odometer | color | interior | seller | mmr | sellingprice | saledate
0 | 0 | 2015 | KIA | SORENTO | LX | SUV | AUTOMATIC | 5xyktca69fg566472 | CA | 5.0 | 16639.0 | WHITE | BLACK | KIA MOTORS AMERICA INC | 20500.0 | 21500.0 | 2014-12-16 04:30:00
1 | 1 | 2015 | KIA | SORENTO | LX | SUV | AUTOMATIC | 5xyktca69fg561319 | CA | 5.0 | 9393.0 | WHITE | BEIGE | KIA MOTORS AMERICA INC | 20800.0 | 21500.0 | 2014-12-16 04:30:00
2 | 2 | 2014 | BMW | 3 SERIES | 328I SULEV | SEDAN | AUTOMATIC | wba3c1c51ek116351 | CA | 45.0 | 1331.0 | GRAY | BLACK | FINANCIAL SERVICES REMARKETING (LEASE) | 31900.0 | 30000.0 | 2015-01-14 20:30:00
3 | 3 | 2015 | VOLVO | S60 | T5 | SEDAN | AUTOMATIC | yv1612tb4f1310987 | CA | 41.0 | 14282.0 | WHITE | BLACK | VOLVO NA REP/WORLD OMNI | 27500.0 | 27750.0 | 2015-01-28 20:30:00
4 | 4 | 2014 | BMW | 6 SERIES GRAN COUPE | 650I | SEDAN | AUTOMATIC | wba6b2c57ed129731 | CA | 43.0 | 2641.0 | GRAY | BLACK | FINANCIAL SERVICES REMARKETING (LEASE) | 66000.0 | 67000.0 | 2014-12-18 04:30:00
5 | 5 | 2015 | NISSAN | ALTIMA | 2.5 S | SEDAN | AUTOMATIC | 1n4al3ap1fn326013 | CA | 1.0 | 5554.0 | GRAY | BLACK | ENTERPRISE VEHICLE EXCHANGE / TRA / RENTAL / TULSA | 15350.0 | 10900.0 | 2014-12-30 04:00:00
6 | 6 | 2014 | BMW | M5 | BASE | SEDAN | AUTOMATIC | wbsfv9c51ed593089 | CA | 34.0 | 14943.0 | BLACK | BLACK | THE HERTZ CORPORATION | 69000.0 | 65000.0 | 2014-12-17 04:30:00
7 | 7 | 2014 | CHEVROLET | CRUZE | 1LT | SEDAN | AUTOMATIC | 1g1pc5sb2e7128460 | CA | 2.0 | 28617.0 | BLACK | BLACK | ENTERPRISE VEHICLE EXCHANGE / TRA / RENTAL / TULSA | 11900.0 | 9800.0 | 2014-12-16 05:00:00
8 | 8 | 2014 | AUDI | A4 | 2.0T PREMIUM PLUS QUATTRO | SEDAN | AUTOMATIC | wauffafl3en030343 | CA | 42.0 | 9557.0 | WHITE | BLACK | AUDI MISSION VIEJO | 32100.0 | 32250.0 | 2014-12-18 04:00:00
9 | 9 | 2014 | CHEVROLET | CAMARO | LT | CONVERTIBLE | AUTOMATIC | 2g1fb3d37e9218789 | CA | 3.0 | 4809.0 | RED | BLACK | D/M AUTO SALES INC | 26300.0 | 17500.0 | 2015-01-19 20:00:00

Last rows

(index) | Unnamed: 0 | year | make | model | trim | body | transmission | vin | state | condition | odometer | color | interior | seller | mmr | sellingprice | saledate
558824 | 558824 | 2013 | AUDI | S5 | PREMIUM PLUS QUATTRO | CONVERTIBLE | AUTOMATIC | waucgafh6dn005382 | FL | 5.0 | 20158.0 | SILVER | BLACK | PRESTIGE AUDI | 43900.0 | 42000.0 | 2015-07-08 23:00:00
558825 | 558825 | 2011 | SUBARU | FORESTER | 2.5X | SUV | MANUAL | jf2shbac9bg741815 | CA | 41.0 | 71693.0 | SILVER | BLACK | REMARKETING BY GE/BILLION DODGE | 12300.0 | 11750.0 | 2015-07-08 02:30:00
558826 | 558826 | 2014 | JEEP | GRAND CHEROKEE | LIMITED | SUV | AUTOMATIC | 1c4rjebg4ec573100 | CA | 44.0 | 9024.0 | GRAY | BLACK | ENTERPRISE VEHICLE EXCHANGE / TRA / RENTAL / TULSA | 29800.0 | 17300.0 | 2015-07-09 02:00:00
558827 | 558827 | 2014 | JEEP | GRAND CHEROKEE | LAREDO | SUV | AUTOMATIC | 1c4rjfag0ec466276 | PA | 42.0 | 25180.0 | GRAY | BLACK | HERTZ CORPORATION/GDP | 26000.0 | 24500.0 | 2015-07-06 23:30:00
558828 | 558828 | 2012 | DODGE | GRAND CARAVAN | AMERICAN VALUE PACKAGE | MINIVAN | AUTOMATIC | 2c4rdgbg1cr349287 | MA | 37.0 | 97036.0 | SILVER | GRAY | GE FLEET SERVICES FOR ITSELF/SERVICER | 8300.0 | 7800.0 | 2015-07-06 23:30:00
558831 | 558831 | 2011 | BMW | 5 SERIES | 528I | SEDAN | AUTOMATIC | wbafr1c53bc744672 | FL | 39.0 | 66403.0 | WHITE | BROWN | LAUDERDALE IMPORTS LTD BMW PEMBROK PINES | 20300.0 | 22800.0 | 2015-07-06 23:15:00
558833 | 558833 | 2012 | RAM | 2500 | POWER WAGON | CREW CAB | AUTOMATIC | 3c6td5et6cg112407 | WA | 5.0 | 54393.0 | WHITE | BLACK | I -5 UHLMANN RV | 30200.0 | 30800.0 | 2015-07-08 02:30:00
558834 | 558834 | 2012 | BMW | X5 | XDRIVE35D | SUV | AUTOMATIC | 5uxzw0c58cl668465 | CA | 48.0 | 50561.0 | BLACK | BLACK | FINANCIAL SERVICES REMARKETING (LEASE) | 29800.0 | 34000.0 | 2015-07-08 02:30:00
558835 | 558835 | 2015 | NISSAN | ALTIMA | 2.5 S | SEDAN | AUTOMATIC | 1n4al3ap0fc216050 | GA | 38.0 | 16658.0 | WHITE | BLACK | ENTERPRISE VEHICLE EXCHANGE / TRA / RENTAL / TULSA | 15100.0 | 11100.0 | 2015-07-08 23:45:00
558836 | 558836 | 2014 | FORD | F-150 | XLT | SUPERCREW | AUTOMATIC | 1ftfw1et2eke87277 | CA | 34.0 | 15008.0 | GRAY | GRAY | FORD MOTOR CREDIT COMPANY LLC PD | 29600.0 | 26700.0 | 2015-05-27 22:30:00